
Storage: Show correct disk size #14595

Draft

hamistao wants to merge 21 commits into main from disk_size_fixes
Conversation

hamistao
Contributor

@hamistao hamistao commented Dec 6, 2024

This is a follow-up PR to #14511 and aims to address the problems outlined in #14511 (comment).

To do this, we need a way to determine whether an instance volume had its quota set based on `volume.size` at the pool level. For this, we are introducing the `volatile.rootfs.size` config key for instance volumes; previously it was only recognized on image volumes. This config key represents the current quota of an instance volume at any point in time, which lets us determine whether a volume's quota was derived from `volume.size` or not.

Also, changes to the disk device only cascade onto the volume's `volatile.rootfs.size` when the quota is actually applied to the volume. This lets us handle instances that don't support online resizing, since the key we are looking at only changes when a quota is applied to the volume, i.e. when the instance restarts.

This key was chosen over the `size` key because the intended behavior better fits the pattern of `volatile.*` keys, which can be changed by LXD itself without the user specifying them.
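As a rough illustration of the mechanism described above, a sketch in Go (the helper name `quotaSource` and the config layout are assumptions for illustration, not LXD's actual code):

```go
package main

import "fmt"

// quotaSource is a hypothetical helper: given an instance volume's config
// and its pool's config, report where the currently applied quota came from.
// volatile.rootfs.size holds the quota that is actually applied to the
// volume, so comparing it against the pool's volume.size default tells us
// whether the quota was inherited from the pool.
func quotaSource(volConfig map[string]string, poolConfig map[string]string) string {
	applied := volConfig["volatile.rootfs.size"]
	if applied == "" {
		return "none" // no quota has ever been applied to this volume
	}

	if applied == poolConfig["volume.size"] {
		return "pool" // inherited from the pool-level volume.size default
	}

	return "device" // set explicitly, e.g. via the root disk device's size key
}

func main() {
	vol := map[string]string{"volatile.rootfs.size": "10GiB"}
	pool := map[string]string{"volume.size": "10GiB"}
	fmt.Println(quotaSource(vol, pool)) // pool
}
```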

@github-actions github-actions bot added Documentation Documentation needs updating API Changes to the REST API labels Dec 6, 2024
doc/api-extensions.md
doc/rest-api.yaml
lxd/storage/backend_lxd.go
@hamistao hamistao force-pushed the disk_size_fixes branch 2 times, most recently from 2a2e63c to 7c08121 Compare December 9, 2024 06:17
@hamistao hamistao marked this pull request as ready for review December 12, 2024 04:47
@hamistao hamistao requested a review from tomponline December 12, 2024 05:20
@tomponline tomponline changed the title Follow up on showing correct disk size Storage: Show correct disk size Dec 12, 2024
@tomponline
Member

For this, we are introducing the volatile.rootfs.size config key for instance volumes

Is this only for container filesystem based volumes or for VM block volumes too?

Is this key used for VM image volumes too?

@hamistao
Contributor Author

Is this only for container filesystem based volumes or for VM block volumes too?

This is for all instance types, since all of them can be constrained by setting the root disk device's `size` key.

Is this key used for VM image volumes too?

Yes, it is currently only used for image volumes.

@tomponline
Member

Yes, it is currently only used for image volumes.

That's not what I asked. Is it currently populated for VM image volumes as well as container image volumes?

@hamistao
Contributor Author

That's not what I asked. Is it currently populated for VM image volumes as well as container image volumes?

Oh sorry, I misread your question. Yes, it is used for both container and VM images.

@hamistao hamistao marked this pull request as draft December 12, 2024 17:18
@tomponline
Member

@hamistao as discussed we will revert/continue to return 0 for volume size for both instance state and volume state when a volume is unbounded, rather than -1.

@tomponline
Member

@hamistao there's a comment in validateVolumeCommonRules that says:

// volatile.rootfs.size is only used for image volumes.

Does this need to be updated/considered/changed?

@tomponline
Member

@hamistao please can you explore whether we can prevent volatile.rootfs.size being editable for volumes that have it set, as a separate PR.

@tomponline
Member

@tomponline tomponline left a comment

Please make the changes previously discussed.

@github-actions github-actions bot removed Documentation Documentation needs updating API Changes to the REST API labels Jan 14, 2025
@hamistao hamistao force-pushed the disk_size_fixes branch 3 times, most recently from ebd1bee to 8084432 Compare January 14, 2025 07:51
@hamistao
Contributor Author

We have reached a crossroads when it comes to what value to use to represent the size of unbounded (sizeless) volumes.

Previously, we got size 0 in every scenario except when a user explicitly set the volume size, either through `lxc config device set c1 root size=10GiB` for instance volumes, or `lxc storage volume set pool custom_vol size 10GiB` for custom volumes. Besides disregarding the size that block-based volumes need to have, this also doesn't account for the fact that block-based instances are not live-resizable, so getting a non-zero value could also be inaccurate. Also, when getting usage was not supported, the API used to show an empty state struct for the instance disk, so no value was defined for usage or total.

Mind that these 0's were never exposed through CLI command outputs, so `lxc storage volume info pool volume`, `lxc info instance` and `lxc config show instance` would simply omit the disk/volume total size.

Since #14511, we now show -1 for the total size of sizeless instances (not custom volumes yet) and -1 for usage when it is not supported. We also now show the default block volume size (10GiB) for block-based instances. Custom volumes still behave as they did previously.

So now we have two possible ways to think about the size of sizeless volumes:
A) The 0 size we previously reported for sizeless volumes was not a bug, just inaccurate reporting, so we should change it back to 0.
B) We should never show 0 as a volume size, since the correct way to represent "no size" is -1; showing 0 was never correct, so we should fix custom volumes accordingly.

On the one hand, option A means "changing less" and accounts for the possibility that some users rely on this 0 to check whether an instance has a root disk size set. It is also more flexible in case we want to make the field an unsigned integer in the future.
On the other hand, option B means "changing more" and using -1, which is more consistent with our API; for example, CPU usage and project limits default to -1 at the API level. Also, although changing to -1 can be seen as an API break, the disk size reported by the API was extremely unreliable, so if one was almost always getting a 0 size, it is hard to say that was meaningful information to begin with. This is why I am more inclined to go with B and keep it as -1.

I hope this was clear; if any info is missing we can discuss it in our 1-1 later. cc @tomponline

@hamistao hamistao marked this pull request as ready for review January 14, 2025 09:07
@tomponline
Member

> We have reached a crossroads when it comes to what value to use to represent the size of unbounded (sizeless) volumes. […]

@hamistao as discussed in our 1:1 today, we tested the current state APIs for disk usage for instances and custom volumes as follows:

Instance state:

lxc query /1.0/instances/c1/state

dir:
    "disk": {}

zfs:
    "disk": {
        "root": {
            "total": 0,
            "usage": 7117312
        }
    }

btrfs:
    "disk": {}

lvm:
    "disk": {
        "root": {
            "total": 0,
            "usage": 1182437376
        }
    }

Custom volume state:

lxc query /1.0/storage-pools/default/volumes/custom/vol1/state

dir: Error: Not supported

zfs:
    {
        "usage": {
            "total": 0,
            "used": 24576
        }
    }

btrfs: Error: Not supported
lvm: Error: Not supported

So it looks like we do return total: 0 for ZFS when the volume is unbounded.

So we agreed to maintain the existing premise that unbounded volumes should return a size of 0.

We can use -1 for total size in an error scenario if needed. We see this pattern already for CPU usage and processes fields.
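The convention agreed above can be summarized in a small sketch (illustrative only; `describeTotal` is not an LXD function): 0 means unbounded, -1 signals an error or unsupported value, and anything positive is a real byte count.

```go
package main

import "fmt"

// describeTotal interprets a reported volume total under the agreed
// convention: 0 = unbounded (no quota set), -1 = error/unsupported, and
// any positive value is the actual size in bytes.
func describeTotal(total int64) string {
	switch {
	case total < 0:
		return "error/unsupported"
	case total == 0:
		return "unbounded"
	default:
		return fmt.Sprintf("%d bytes", total)
	}
}

func main() {
	fmt.Println(describeTotal(0))           // unbounded, e.g. the ZFS custom volume above
	fmt.Println(describeTotal(-1))          // error/unsupported
	fmt.Println(describeTotal(10737418240)) // a 10GiB volume
}
```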

hamistao added a commit to hamistao/lxd that referenced this pull request Jan 21, 2025
…limits"

As per the discussion outlined on canonical#14595 (comment), we are sticking with using 0 to represent unbounded volumes.

This reverts commit 26b8d73.

Signed-off-by: hamistao <[email protected]>
All existing handling of `volatile.rootfs.size` for images already allows it to be a human-readable string such as '123 GiB', so this also accepts those strings when validating the key.

Signed-off-by: hamistao <[email protected]>
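A standalone sketch of the dual-format validation described above (LXD has its own size parser elsewhere in the tree; this minimal version is illustrative only and accepts raw byte counts as well as binary-suffixed strings):

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseByteSize accepts either a raw byte count ("1024") or a
// human-readable size with a binary suffix ("123 GiB"), mirroring the
// two forms volatile.rootfs.size may take. Suffixes are checked from
// longest to shortest so "GiB" is not mistaken for a bare "B".
func parseByteSize(s string) (int64, error) {
	s = strings.TrimSpace(s)

	suffixes := []struct {
		unit string
		mult int64
	}{
		{"KiB", 1 << 10}, {"MiB", 1 << 20}, {"GiB", 1 << 30}, {"TiB", 1 << 40}, {"B", 1},
	}

	for _, suf := range suffixes {
		if strings.HasSuffix(s, suf.unit) {
			num := strings.TrimSpace(strings.TrimSuffix(s, suf.unit))
			n, err := strconv.ParseInt(num, 10, 64)
			if err != nil {
				return 0, err
			}
			return n * suf.mult, nil
		}
	}

	// No suffix: treat the whole string as a raw byte count.
	return strconv.ParseInt(s, 10, 64)
}

func main() {
	n, err := parseByteSize("123 GiB")
	fmt.Println(n, err) // 132070244352 <nil>
}
```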
This makes it so that an instance volume's `volatile.rootfs.size` config is aligned with the volume's actual size at any point in time,
i.e. configs like the disk device's `size` and the pool's `volume.size` can propagate into the volume's `volatile.rootfs.size`, similarly to what happens with custom volumes.

Signed-off-by: hamistao <[email protected]>
For instance volumes that contain `volatile.rootfs.size`, derive the disk
size from this config key. For instances that precede the usage of this
config key, use the old method to avoid breaking the API
when getting the size of those instances.

Signed-off-by: hamistao <[email protected]>
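The fallback this commit message describes might look roughly like the following (hypothetical names; the real change lives in the storage backend code):

```go
package main

import "fmt"

// effectiveDiskSize is an illustrative helper: newer instance volumes carry
// volatile.rootfs.size, so the disk size is read from there; older volumes
// that predate the key fall back to the legacy source (the root disk
// device's size entry) to avoid breaking the API for existing instances.
func effectiveDiskSize(volConfig map[string]string, rootDevice map[string]string) string {
	if v := volConfig["volatile.rootfs.size"]; v != "" {
		return v
	}

	// Legacy path: the volume predates volatile.rootfs.size.
	return rootDevice["size"]
}

func main() {
	newVol := map[string]string{"volatile.rootfs.size": "10GiB"}
	oldVol := map[string]string{}
	dev := map[string]string{"size": "20GiB"}
	fmt.Println(effectiveDiskSize(newVol, dev)) // 10GiB
	fmt.Println(effectiveDiskSize(oldVol, dev)) // 20GiB
}
```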
@github-actions github-actions bot removed Documentation Documentation needs updating API Changes to the REST API labels Jan 22, 2025